ABSTRACT

The institutional research that has been carried out 
continually on many campuses and the kind of educational accounting 
that is being demanded of higher education are not one and the same. 
Neither is accountability synonymous with management information 
systems. This paper attempts to clarify the differences among 
evaluation in higher education, educational accounting, and 
management information systems. Evaluation is concerned primarily
with educational effectiveness; accountability is concerned with
effectiveness and efficiency; and the management information system
is the central feature of an accountability system. The paper also
deals with some of the problems encountered in measuring educational 
impact, such as: (1) the problem of defining and assessing
institutional goals; (2) the criterion problem and behavioral
objectives in assessing college impact; (3) the lack of variance 
problem and the need for multiple criterion measures; and (4) the 
problem of inferring effects in naturalistic settings. (AF)
The Need for a Critical Look

Almost all reasonable observers of American higher education agree
that the time has arrived (indeed has been with us right along,
though too few have been aware of it) for higher education to take a
close, careful, and critical look at itself. While it is true that there
has always been the need for institutions to conduct ongoing programs
of self-evaluation, the external pressures (that is, pressures from
public officials, potential donors to the institution, taxpayers, and
so forth) for colleges and universities to take stock of themselves
are greater now than perhaps ever before in the history of American
higher education.

There are many reasons for the increased demand for institutional
self-scrutiny, of course. One of the most important, especially in the
public sector, is the fantastic increase in consolidated systems of
higher education in the past decade. It would appear that the crucial
years were 1960 and 1961, when many states began to realize that
voluntary planning and coordinating efforts were not going to be
sufficient to meet the challenges of the 1960s.¹ At that time several
states either enacted legislation creating mandatory coordinating and
planning agencies or strengthened the power of existing ones. The trend
was thus set in motion, and the implications for statewide evaluation
and systematic accounting procedures were clear. Statewide planning,
if it were to be at all superior to the nearly autonomous development
of institutions that preceded it, had to be based on more than pure
fancy. Institutions were now expected to justify their requests for
money, approval for new programs, and the like by facts about their
institutions and their operations. Thus, even though “institutional
research” had been around for a long time, it was not until the early
1960s that very many colleges and universities began to take it
seriously. According to a survey conducted by Francis Rourke and
Glenn Brooks, there were only 10 institutions of higher education in
the country boasting formal offices of institutional research prior
to 1955, but by 1964 the number had swelled to 116.²

1. Ernest G. Palola, Timothy Lehmann, and William R. Blischke, Higher
Education by Design: The Sociology of Planning. University of
California, Berkeley: Center for Research and Development in Higher
Education, 1970.

2. Francis E. Rourke and Glenn E. Brooks, The Managerial Revolution in
Higher Education. Baltimore: The Johns Hopkins Press, 1966, 184 pp.

Closely related to the growth of multi-institutional coordinating
agencies have been the increasing financial problems confronting
higher education, a fiscal shortage of growing urgency in the past
five years that has recently reached crisis proportions. According to
a recent report by the Carnegie Commission on Higher Education,
“higher education has come upon hard times. The trouble is serious
enough to be called a depression.”³ The same study goes on to predict
that, if the current trend continues, almost all higher educational
institutions eventually will be in financial difficulty. Support for
such a position is provided by a report from the American Association
for Higher Education, which claims that not only is support for higher
education descending rapidly but also that there is no indication of
a let-up in the money squeeze for the next five years.⁴

The Credibility Gap and Demands for Educational “Accounting”

The reasons offered for the financial crisis in higher education are
numerous and often interrelated, including the Vietnam war, a national
rearrangement of priorities (with greater attention going to poverty,
racism, and ecological problems), increased enrollments, rising costs,
and an overall steady decline of the American economy. Undoubtedly,
however, one of the major causes of the current income shortage in
higher education is what might be referred to as “the credibility
gap,” a growing feeling of mistrust on the part of higher education's
relevant publics (be they alumni, parents of school-age children, or
whatever) about what higher education is doing or “producing.” Such
uneasy feelings have been nurtured, of course, by the rash of campus
disturbances during the past few years, disturbances that have led to
adverse reactions affecting both private and legislative support. It
would probably be a mistake, however, to lay a disproportionate share
of the blame for the “credibility gap” at the feet of the campus
protestors.⁵ While they may have provided the observable stimulus for
increased expressions of mistrust, it is probably safe to say that
higher educational institutions have long been viewed with suspicion
by many who have helped support them. Such misgivings are tolerable
during periods when the economy is on the upswing. But during a
questionable economy or a clear-cut recession it is understandable
that money finds its way to those who can demonstrate that the money
has been spent to the satisfaction of the giver. While better times
would have been characterized by a sort of suspicious laissez-faire
attitude toward higher education, there is now a demand for evidence
that the large sum of money being spent on American higher education
is being judiciously allocated. Concern about the costs of new
educational programs, renewed interest in the costs of old programs,
questions about the need for annual faculty salary increases, and the
legitimacy of the practice of tenure: all these and more are being
critically reappraised. At all levels and in various ways higher
educational institutions are being called upon to “account” for their
programs and actions, just as other institutions or agencies are
expected to justify their operations. College administrators, who have
been allowed to luxuriate in the secrecy of their tasks, are now being
pressured into a stance of openness. All who make claims for their
“products” are asked to provide evidence to support their claims, and
although there are numerous other reasons for institutions to study
themselves carefully and systematically, it is quite clear that
financial stress is the most powerful persuader.

3. Earl F. Cheit, The New Depression in Higher Education. New York:
McGraw-Hill, 1971, p. 4.

4. Robert T. Blackburn, “Changes in Faculty Life Styles,” Research
Report Number 1. Washington, D.C.: American Association for Higher
Education, undated, 4 pp.

5. Or too much credit, either. It is ironic to note that the mistrust
for higher education arising from campus disruptions is to some extent
a sign of the success of such demonstrations, for the purpose of many
activist students is to highlight the lack of relevance and the
worthlessness of higher education generally.

It is also clear that the “institutional research” that has been
carried out continually on many campuses and the kind of educational
accounting that is being demanded of higher education now are not
one and the same. In a broad sense, of course, they are both forms of
educational evaluation, a practice that has been around for many
years, but evaluation and “accountability” are not the same either,
even though, again, the overlap between the concepts is substantial.
Nor is accountability synonymous with “management information
systems,” “cost-benefit analysis,” or “program planning and budgeting
systems,” though all of these are interrelated. Consequently, it is
imperative that the distinctions between and among these various
concepts be clarified.





An Attempt at Some Conceptual Clarifications

Evaluation in Higher Education

Evaluation in higher education has traditionally been concerned with
how well or to what degree specifically defined objectives of a program
(a curriculum, a set of operating principles, or whatever) were
attained. In a small percentage of cases the essential ingredients of
such an undertaking have been very much like those employed by a
scientist (social or other): (1) behaviorally defined objectives,
(2) the random assignment of subjects to treatments (usually
educational experiences), (3) clearly differentiated treatments (such
as different teaching techniques or other forms of curricular
innovation), and (4) criterion measures chosen or developed on the
basis of the behavioral objectives. Most programs in higher education,
however, have not lent themselves to this experimental model.
Obviously, it is quite insensitive to most of the “real world”
problems confronted in higher education. As one evaluator has
remarked, “What does one do when not all the relevant objectives are
manifested in directly observable specific individual behavior? What
does one do about deliberately trying to measure effects that are not
objectives of the program? What does one do when random assignment of
subjects to treatments cannot be accomplished? What does one do when
he lacks clearly differentiated treatments?”⁶ Because of concerns such
as these, most educational evaluation has been based on a model that
is both more comprehensive and more flexible. The two outstanding
features of this model have been, first, a concern with the question
“What are the consequences of higher education?” (rather than the
objectives), and, second, a style of inquiry that is more exploratory
in nature (as opposed to the experimental orientation of the classical
model). The concern with the consequences of higher education stems
from recognition that certain outcomes of higher education are often
unintended (or at least not specifically stated) but still potentially
important, and to ignore them simply because they were not
acknowledged at the outset would be to neglect important and
illuminating information. The preference for a style of inquiry that
is exploratory in nature emerges from an awareness that higher
educational institutions are not scientific laboratories in which the
various elements of the enterprise can be carefully controlled or
manipulated to please the evaluator. Many institutions are continually
changing their programs, toying with new approaches, and attempting to
engender free environments. The exploratory style is typified by the
comment, “The spirit of the evaluator should be adventurous. If only
that which could be controlled or focused were evaluated, then a great
many important educational and social developments would never be
evaluated, at least not by ‘evaluators’; that would be a pity.”⁷

6. C. Robert Pace, “An Evaluation of Higher Education: Plans and
Perspectives,” CSE Report No. 51, Center for the Study of Evaluation.
Los Angeles: UCLA Graduate School of Education, January, 1969, p. 2.
Mimeographed.

Educational Accounting

“Accountability” is the new “in” word in American education. The
concept of educational accountability has been the subject of numerous
symposia and special issues of educational journals, and certain
forms of educational accountability have been brought to the attention
of the American public through popular accounts in the newspaper and
other news media. It is a very sensitive concept, one that has been
the center of much controversy at the elementary and secondary school
levels.

In many ways, educational accountability and educational evaluation
are essentially the same. Accountability, like evaluation, is aimed
at learning about the effect of educational institutions. Like
evaluation, accountability is concerned with the effect of certain
educational “treatments” (school experiences) on the students, after
relevant characteristics of the students at the time the students
entered college are “controlled.” The question “Are our institutions
living up to their claims?” is of primary concern to both evaluators
and accountability experts.

The differences between evaluation and accountability are less
obvious, but very important. First of all, evaluation is concerned
primarily with educational effectiveness (the degree to which an
institution succeeds in doing whatever it is trying to do), whereas
accountability experts are concerned with effectiveness and efficiency
(the institution's capacity to achieve results with a given
expenditure of resources), and very often they are more interested in
the latter. Thus, while the evaluator's task is an extremely difficult
one (some of the difficulties will be discussed in the next section of
this paper), the educational accountant's role is even more complex,
for he not only attempts to determine what the institution has done,
but also how much it has cost to do it and, ultimately, whether it was
worth the cost.

7. Ibid., p. B.

Of course, as Rourke and Brooks point out, efficiency and
effectiveness are closely related, for how well an institution
achieves its goals may depend largely on how well it has used its
usually limited resources.⁸ But the two are often at odds, as
demonstrated by the rather frequent clash between the college
financial officer, who often tends to be oriented toward a criterion
of efficiency, and the faculty member who complains about the
restraints being placed on his strivings for educational
effectiveness.
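The contrast between the two criteria can be made concrete with a small sketch. Everything in it is hypothetical: the two programs, their figures, and the choice of "graduates" as the result being counted are assumptions made only for illustration, and the two ratios are one simple reading of the definitions given above, not a measure proposed in this paper.

```python
def effectiveness(achieved, intended):
    """Share of the intended result that was actually achieved."""
    return achieved / intended


def cost_per_result(dollars_spent, achieved):
    """Dollars spent per unit of result (lower means more efficient)."""
    return dollars_spent / achieved


# Hypothetical programs: graduates intended, graduates produced, dollars spent.
program_x = {"intended": 100, "achieved": 90, "cost": 900000}
program_y = {"intended": 100, "achieved": 60, "cost": 300000}

x_effect = effectiveness(program_x["achieved"], program_x["intended"])
y_effect = effectiveness(program_y["achieved"], program_y["intended"])
x_cost = cost_per_result(program_x["cost"], program_x["achieved"])
y_cost = cost_per_result(program_y["cost"], program_y["achieved"])

# The evaluator's criterion favors X; the financial officer's favors Y.
more_effective = "X" if x_effect > y_effect else "Y"
more_efficient = "X" if x_cost < y_cost else "Y"
```

In this invented case the program that comes closer to its goal is also the one that spends more per graduate, which is the kind of disagreement between the financial officer and the faculty member described above.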

A second difference between evaluation and accountability has to
do with the stimulus for the study and who participates in the inquiry.
Institutional evaluation has traditionally been an activity carried out
as an ongoing function within the institution by members of
administrative and faculty groups. The entire process of self-study has
been viewed as one that would enable members of the staff to gain
more insights into their own strengths and weaknesses and thereby
improve the educational, research, and service programs of the
institution. It is viewed, in other words, as an internal process
having positive ends. Accountability, on the other hand, has brought
with it the notion of external judgment. Judging, at least, from the
reactions of many elementary and secondary school teachers, there is
the clear indication that “accountability” is regarded as a vindictive
rather than an affirmative process. Someone not in the school itself
is passing judgment on the quality of the performance of those who
work there. Articles and papers making a case for accountability often
include such statements as “The professional educators who operate
them [the schools] must be held responsible” and “The taxpayers are
entitled to know what they are getting.” As one teacher has remarked,
“If we say that someone is accountable we usually mean that ‘he must
suffer the consequences of his actions.’ We hardly ever mean the more
positive ‘he will profit from the consequences of his actions.’”⁹

8. Rourke and Brooks, op. cit.

9. Barry R. McGhan, “Accountability as a Negative Reinforcer,”
American Teacher, Vol. 55, No. 3, November, 1970, p. 13.

Though there are other differences between evaluation and
accountability (for instance, educational evaluators are often
psychologists or educational researchers, whereas educational
accountants are more often economists or people with backgrounds in
business and finance), the differences between effectiveness and
efficiency as the focus of the research and between the perceptions
(accurate or not) of evaluation as a positive form of self-study and
accountability as a retributive form of judgment by some external body
seem to be the major distinguishing characteristics.

Educational accountability can and does take many forms. At the
higher education level, two forms seem to be most likely to gain
support. The first is for higher educational institutions (or systems)
to move toward improved, output-oriented management methods, always
with an eye toward efficiency. In many institutions, this has been the
primary function of their offices of institutional research for some
years. The institutions perform their own self-study (as in
evaluation), based on improved output-oriented management methods
such as program budgeting (as opposed to straight line-item
budgeting), systems analysis, standardizing of forms for gathering
basic institutional data and of routine computer programs to yield
reports, and so forth. The institutions then make their own periodic
reports to their relevant publics, for instance, their alumni or
donors in the case of private institutions and the board of regents or
statewide coordinating body in the case of public institutions.

The second form of accountability that would seem to be viable in
higher educational institutions is what Stephen Barro calls
“institutionalization of external evaluations or audits.”¹⁰ In this
accountability system, assessments of efficiency and effectiveness
would be made by some agency external to the institution, such as a
statewide office of higher education. In this case, the institution's
performance would be judged by direct comparison with others with the
same financial base. All data used for such comparisons would have to
be objective and comparable among all institutions, such data being
gathered by the central agency by means of standard reporting routines
and kept in a central data file for purposes of regular
interinstitutional comparisons.

10. Stephen M. Barro, “An Approach to Developing Accountability
Measures for the Public Schools,” Phi Delta Kappan, Vol. LII, No. 4,
December, 1970, pp. 196-206.

A third form of accountability that might conceivably gain support
among those passing judgment on the quality of an institution's
activities is a performance incentive system for faculty members.
Under this plan, salary increases, promotion, or other devices may be
used as rewards for demonstrated quality performance by the faculty.¹¹
Such an approach would bring the accountability notion right down to
specific members of the faculty, whereas it is usually thought of as
pertaining to the institutional or possibly departmental level. Yet
the current overabundance of Ph.D.s and scarcity of vacancies at the
college level, combined with the growing insistence among students
that they be allowed to rate their teachers, make it more likely that
accountability at the individual teacher level may be forthcoming.

There are other forms of educational accountability, but their
appropriateness for higher education is questionable. Performance
contracting (in which contracts are made with external agencies,
usually private firms, to conduct specified instructional activities
presumably leading to agreed-upon, measurable results, such as a
gain in scores of so many points on a standardized reading test) and
alternative educational systems (also referred to as the “voucher
system,” in which parents are given tuition vouchers and allowed to
pay for their children's education at a school of their own choosing)
seem to be less suited for higher education, mainly because they are
geared to an educational level at which there is rather wide agreement
or consensus about the specific developmental skills (for instance,
reading, writing) expected of its students.

11. Though some higher educational institutions have occasionally
granted cash awards to faculty members voted as outstanding teachers
by the students, such reinforcement is usually available to so few
that it can hardly be regarded as a bona fide performance incentive
system as meant here.

Management Information Systems

A central feature of accountability systems in higher education
(especially the external evaluation by a central agency) is the
management information system (MIS). The MIS is a system of information
collection, storage, collating, and distribution that makes it possible
to monitor routinely certain aspects of an institution's operations.
At the heart of the MIS is a central pool of data, consisting of pieces
of information comparable from one institution to another. Such a
system makes interinstitutional comparisons possible and meaningful,
for the interpretations can be based on common data elements. One of
the problems of making interinstitutional comparisons in the past has
been that the information available has not been exactly comparable.
A full-time-equivalent student at one institution, for example, has
not necessarily been defined in the same way as a full-time-equivalent
student in another institution. And so on. These systems, in and of
themselves, do not represent another form of educational accounting or
evaluation. They are, however, an indispensable tool for the conduct
of any form of interinstitutional comparisons.
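The point about common data elements can be illustrated with a short sketch. It is not drawn from any actual MIS: the institution names, the figures, and above all the rule that one full-time-equivalent (FTE) student equals 15 credit hours per term are assumptions chosen only to show why a shared definition matters.

```python
from dataclasses import dataclass

# Assumption for illustration only: one FTE student = 15 credit hours per term.
CREDIT_HOURS_PER_FTE = 15.0


@dataclass
class TermRecord:
    institution: str             # hypothetical institution label
    student_credit_hours: float  # total credit hours produced in the term
    instructional_cost: float    # dollars spent on instruction in the term


def fte_students(record, hours_per_fte=CREDIT_HOURS_PER_FTE):
    """Convert credit hours to FTE students under a stated definition."""
    return record.student_credit_hours / hours_per_fte


def cost_per_fte(record, hours_per_fte=CREDIT_HOURS_PER_FTE):
    """Instructional cost per FTE student under the same definition."""
    return record.instructional_cost / fte_students(record, hours_per_fte)


records = [
    TermRecord("College A", 30000.0, 4000000.0),
    TermRecord("College B", 45000.0, 5400000.0),
]

# With one shared definition the two figures can be compared directly.
comparison = {r.institution: round(cost_per_fte(r), 2) for r in records}

# If College A privately counted 12 hours as one FTE, its enrollment
# would appear larger and its cost per student lower than it really is.
college_a_fte_private_rule = fte_students(records[0], hours_per_fte=12.0)
```

Under the shared 15-hour rule College A reports 2,000 FTE students; under a private 12-hour rule the same credit hours become 2,500, which is exactly the kind of incomparability a central pool of commonly defined data is meant to remove.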

A good example of an MIS for higher education is the one developed
by the Systems Research Group of Toronto. Known by the acronym
CAMPUS (for Comprehensive Analytical Methods for Planning University
Systems), this MIS is designed to help colleges and universities gain
“the maximum educational advantage from the resources which are put
at their disposal.”¹² CAMPUS focuses on basic operational data that
are already available, in some form, on most campuses. By
concentrating on such basic pieces of information as student credit
hours produced (by various academic levels), student enrollment (head
counts), faculty teaching loads, and information regarding classroom
space, tuition, and the like, CAMPUS is a good example of one way of
improving resource allocations in higher education. The CAMPUS
system, it should be noted, does not emphasize educational outputs,
but rather resource allocation, mainly of a fiscal and physical
facilities nature. It is a good example of an MIS designed to improve
institutional efficiency but, at least at the time of this writing,
does not appear to be designed to offer college administrators a means
of examining their effectiveness.

A good example of a system being designed to assist institutions
(or central agencies) in studying both efficiency and effectiveness is
the MIS of the Western Interstate Commission for Higher Education
(WICHE) in Boulder, Colorado. The WICHE people are interested not
only in the costs of higher education and the best possible means of
allocating scarce resources; they also hope to be able to answer the
question “What are the outcomes [italics mine] and products that are
produced by those programs and services?”¹³ The WICHE rationale is
straightforward: “To examine the costs of educational programs with
little or no evidence available related to the outputs of those
programs offers relatively little advantage to educational decision
makers.”¹⁴ The MIS program of WICHE is indeed ambitious, for it not
only seeks to measure educational outputs and the extent to which
higher educational institutions have influenced those outputs, but it
goes a step further and wishes to assign dollar signs to the outputs
produced. Some of the difficulties in measuring institutional
effectiveness or impact are discussed in the following section.

12. “The Development and Implementation of CAMPUS: A Computer-Based
Planning and Budgeting Information System for Universities and
Colleges.” Toronto: Systems Research Group, August, 1970, p. 2.

13. Robert A. Definition and Measurement of the Outcomes and
Activities of Higher Education. Boulder, Colo.: Western Interstate
Commission for Higher Education, 1971, p. 1.

14. Ibid., p. 2.
Some of the Many Problems
in Measuring Educational Impact

Educational and psychological researchers have been investigating
the area of college impact for years, and the methodological problems
they have confronted are by now well known to most students of
higher education. These include (but are not restricted to) the
problems of defining and assessing institutional goals, of relating
college effects and college goals, and of how (and whether) to develop
behavioral objectives for educational institutions; the “lack of
variance” phenomenon; and the very difficult problems of inferring
causal connections between inputs and outputs in naturalistic
settings. Since educational accounting systems attempt to go further
and develop ratings of institutional quality on the basis of some of
these measures, further problems (particularly nontechnical problems
of professional staff morale, interinstitutional competition, and the
like) can also be expected to develop, but are beyond the purview of
this paper.

The Problem of Defining and Assessing Institutional Goals

Many have been arguing for some time that any evaluation of an
institution's effectiveness must take into consideration the
institution's goals. The problem, of course, is that too few
institutions have really seriously considered what their goals are,
and those that have often find that the various members of the college
community disagree over what the purposes of the institution should
be. It is interesting to note that the recent goals study conducted by
Edward Gross and Paul Grambsch used an inventory consisting of 47 goal
statements, only 17 of which dealt with “output” goals (teaching
students, producing research, providing public service); the rest
dealt with “support” goals, such as academic freedom, involving the
faculty in governance of the institution, and so forth.¹⁵

15. Edward W. Gross and Paul V. Grambsch, University Goals and
Academic Power. Washington, D.C.: American Council on Education, 1968,
164 pp.

Educational Testing Service (ETS) has been conducting various
studies and literature reviews over the past two years to prepare for
the construction of a goals inventory for institutions of higher
education. At the time of this writing a preliminary Institutional
Goals Inventory (IGI) has been developed and is being “tried out” and
modified before being made available for institutional self-study. The
preliminary form of the IGI contains 100 statements of plausible
institutional goals (for instance, “to help students develop the
ability to speak and write effectively,” “to strengthen the religious
faith of students,” “to assist in efforts to achieve and maintain
world peace”). For each statement the respondents (students, faculty,
administrators, alumni, trustees, members of the immediate community,
or whatever) indicate the extent to which they feel it is and should
be a goal of the institution. Such an approach makes several things
possible. First, while it may be true that divergent groups will never
see eye to eye on the major purposes of higher educational
institutions, it will at least be possible to quantify the extent of
their disagreement and account for it in subsequent studies. Second,
the technique provides an interesting measure of discrepancy between
what the relevant groups think is and should be highly valued in
academia.

However, while instruments such as the one being developed by
ETS should be helpful to colleges and universities trying to gain a
better perspective on themselves and what they should be doing, the
difficult task of trying to assess whether or not they have achieved
these goals has just begun.
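The two uses of an "is" and "should be" inventory described above can be sketched in a few lines. The sketch is not the IGI scoring procedure, which is not specified here; the 1-to-5 rating scale, the sample responses, and the function names are all hypothetical and serve only to show how a discrepancy score and a between-group gap might be computed.

```python
from statistics import mean

# Hypothetical responses: (group, goal, is_rating, should_rating),
# on an assumed 1-5 scale where 5 means "of extremely high importance".
responses = [
    ("faculty", "speak and write effectively", 3, 5),
    ("faculty", "speak and write effectively", 2, 4),
    ("students", "speak and write effectively", 4, 4),
    ("students", "speak and write effectively", 3, 5),
    ("faculty", "strengthen religious faith", 2, 1),
    ("students", "strengthen religious faith", 2, 3),
]


def summarize(rows):
    """Return {(group, goal): (mean_is, mean_should, should_minus_is)}."""
    grouped = {}
    for group, goal, is_rating, should_rating in rows:
        grouped.setdefault((group, goal), []).append((is_rating, should_rating))
    summary = {}
    for key, pairs in grouped.items():
        mean_is = mean(p[0] for p in pairs)
        mean_should = mean(p[1] for p in pairs)
        summary[key] = (mean_is, mean_should, mean_should - mean_is)
    return summary


def group_gap(summary, goal, group_a, group_b):
    """Difference between two groups' mean 'should be' ratings for a goal."""
    return summary[(group_a, goal)][1] - summary[(group_b, goal)][1]


summary = summarize(responses)
faculty_writing = summary[("faculty", "speak and write effectively")]
faith_gap = group_gap(summary, "strengthen religious faith", "students", "faculty")
```

The third element of each summary entry is the discrepancy between what a group believes the institution values and what it believes the institution ought to value; `group_gap` quantifies how far two groups stand apart on a single goal.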

The Criterion Problem and Behavioral Objectives
in Assessing College Impact

Most statements of educational goals (including those in the
preliminary IGI described above) are too general in nature to permit
precise assessment of whether they have been achieved. How does
one determine whether the institution has “prepared students for the
duties and responsibilities of citizenship,” or “enabled students to
develop a set of principles to guide their behavior,” or any of a whole
series of similar statements that might be found in college catalogs?
It was concerns such as these that led to a “movement” toward the
development of “behavioral objectives” in education. Behavioral
objectives, which are essentially operational definitions, are
statements of specific educational objectives in terms of changed
student behavior. Such statements lend themselves nicely to direct
observation and measurement. (The performance contracting form of
educational accounting referred to earlier in this paper relies
heavily on behavioral objectives. The firms contract with school
systems not to promote the general level of students' reading ability
but rather to improve the mean reading score of the class on such and
such a test by X number of points.) Behavioral objectives, highly
esteemed among educational evaluators for many years, have some
serious shortcomings of their own, however. Not least among them stems
from their specificity, a characteristic which is at once an advantage
and a shortcoming. Because they are highly specific, behavioral
objectives permit precise measurement. On the other hand, this same
precision can be restrictive, in that other highly desirable
educational outcomes are omitted. In commenting on this disadvantage
of behavioral objectives in the development of mathematics tests, one
test specialist has remarked: “. . . the current statements of
behavioral objectives in mathematics for grades K-6 reveal a number of
serious defects which would rightly prevent them from being accepted
by the mathematics community. The first of these defects seems to
result from the energetic attempt to achieve great specificity. The
unfortunate consequence of this atomization is that the
interrelatedness of mathematical concepts is lost and the statement
is a tedious list of very trivial low-level skills. . . . Besides the
foregoing, another difficulty in ultimately stating all the objectives
of mathematics instruction behaviorally arises in connection with the
desire to develop in students the ability to do original thinking in
novel situations. Presumably if these situations and these kinds of
thinking were spelled out with the degree of specificity usually found
in behavioral objectives, the originality and the novelty would be
lost and the objective would ‘evaporate in clarity.’”¹⁶

While the previous criticisms have been directed to behavioral
objectives as they relate to mathematics, teachers and testers in
other fields are often even less sympathetic to the potential of be-
havioral objectives. A spokesman for the humanities has chimed in:
“This trend (toward the use of behavioral objectives in evaluating
school performance) will most likely have disastrous effects on the
teaching of English and other subjects in the humanities, for many
goals in the humanities either do not naturally result in overt be-
haviors or result in overt behaviors occurring so far away in time and
space from the stimulus presentation that for all practical purposes
they are lost to evaluation and will never be counted.”

It would be a shame indeed if educational institutions were evalu-
ated in terms of how well their students performed on measures of
behavioral objectives that were employed in the first place because
they could be measured! Such a situation is much like that of the
proverbial tail wagging the dog. Cronbach has pointed out that
specific behaviors can and should be employed as indicators of con-
structs (for instance, self-confidence, scientific attitude) but not
as the definers of those constructs. Cronbach argues that constructs
ought to be the crucial aspect of the evaluation process, where
constructs refer to a network of relations or characteristics, but
not specific incidents of behavior. Cronbach goes on to say that
“The operationists who want to equate each construct with ‘one
indicator’ . . . are advocating that we restrict descriptions to state-
ments of tasks performed or behavior exhibited and are rejecting
construct interpretations. . . . The writers on curriculum and evalua-
tion who insist that objectives be ‘defined in terms of behavior’ are
taking an ultraoperationalist position, though they have not offered
a scholarly philosophical analysis of the issue.”

To use as definitions of educational goals — at any level of educa- 
tion — only measurable criteria will almost certainly result in a neat 
list of narrow and unimportant educational outcomes. Not to attempt 
to state educational objectives in some measurable way tempts edu- 
cators to rely on the sort of meaningless rhetoric that has charac- 
terized college catalogs for many years. The dilemma is a struggle 
between what Melvin Tumin calls “trivial precision and apparently
rich ambiguity,” and it is imperative that institutional adminis-
trators and faculty members talk with the educational evaluators or
“accountants” and attempt to strike a better balance between these
two extremes. 

That much having been said, it is now just as important to point 
out that there are probably certain consequences of higher educa-
tion that will never be measured and perhaps are not measurable.
Even after the strict operationalists with their behavioral objec- 
tives and the educational philosophers with their vague rhetoric 
agree on objectives that are broader in nature but still measurable, 
there will remain numerous important educational outcomes that will
never be measured in any effective way. Generally, these are the large
questions such as “Is higher education really necessary?,” “Are the
taxpayers getting what they paid for from the publicly supported
institutions of higher education?,” “Are the educational needs of the
state or region being satisfied?,” and so on. None of these questions,
at least as they are phrased here, can be answered by the most so-
phisticated evaluation or educational accounting. At least not until
each of these “large” questions is split into a great many more
“specific” questions. This process of “clarification,” however, accord-
ing to Tumin again, very often proves “to be one of selecting a very
few of the many constituent facets of those questions and focusing
on those alone, hoping that those fragments will somehow ‘represent’
or ‘stand for’ the large whole, such as is implied in ‘serving the needs’
or ‘preparing the children,’ or other comparable ‘holistic’ phrases. In
short, if reliable measurements are to be developed, it is indispensable
that the ‘whole’ impact in which we are always interested be broken
into fragments, and certain selected aspects of that ‘whole’ taken
under study, while the many other fragments and the ‘wholeness’ are
once again put aside.”

This should not be interpreted to mean that educational evaluators
should despair of developing useful, reliable, comprehensive meas- 
ures of educational outcomes. Many have already been developed, 
and efforts to develop better ones should continue. But those who 
work on such problems should be guided by the realistic awareness 
that the “large” questions regarding American higher education
will probably not be answered through their efforts. 

The Lack of Variance Problem and the Need for Multiple Criterion Measures

Almost all proponents of educational accountability tend to favor a 
“value-added” concept. That is, institutions should be judged not by
their outputs alone, but by their outputs relative to their inputs. The
students’ final standing with regard to various characteristics would
not be as important as their changes (usually gains) during the
college years. A rather typical point of view is the following: “What
has the student attained in relation to his capability at the starting
point? This concept approximates educational ‘value-added.’ . . . Accord-
ing to this view, an educational process which moved the student from
the lowest quartile of high-school achievement to the second quartile
of college-graduate achievement would be accomplishing something
tremendous, whereas the college which accepted students only from
the top decile of high school achievement and delivered them into the
top decile of college achievement would be doing relatively much
less.”
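
The contrast drawn in the quotation above can be made concrete with a
small sketch. The two colleges and their percentile figures are
hypothetical, chosen only to mirror the lowest-quartile and top-decile
cases in the quotation. Treating a simple difference in percentile
standing as “value added” is likewise an assumption made for
illustration, not a recommended metric.

```python
# Hypothetical mean percentile standing of each college's students
# at entry and at graduation (illustrative figures only).
colleges = {
    "College A": {"entry_percentile": 20.0, "exit_percentile": 60.0},
    "College B": {"entry_percentile": 95.0, "exit_percentile": 96.0},
}


def value_added(record):
    """Crude gain score: standing at exit minus standing at entry."""
    return record["exit_percentile"] - record["entry_percentile"]


# Rank the colleges by output alone, then by output relative to input.
by_output = sorted(
    colleges, key=lambda name: colleges[name]["exit_percentile"], reverse=True
)
by_gain = sorted(
    colleges, key=lambda name: value_added(colleges[name]), reverse=True
)

print("Ranked by output:", by_output)
print("Ranked by gain:  ", by_gain)
```

Ranked by output alone, College B comes first; ranked by gain, College A
does. The sketch takes the gain scores at face value, which is exactly
the step that proves difficult to justify in practice.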

Such a view (and again it should be emphasized that it is a view
widely held) makes the assumption that educational institutions are
potentially very powerful agents of change, capable of having a great 
deal of impact on both the cognitive and noncognitive attributes of all 
who pass through their doors. It is further assumed that colleges 
differ widely in the amount of impact they have. The accuracy of such
a view, however, is highly questionable. Indeed, most of the evidence 
suggests that it is downright naive, for educational institutions at all 
levels appear to differ very little in terms of the amount of impact 
they have on their students after controls are made for general 
mental ability, socioeconomic status (SES), and other important back- 
ground factors outside the purview of the formal educational institu- 
tion. For example, numerous proponents of the “value-added” concept 
in educational accountability argue that one good criterion for in-
stitutional quality would be their students’ standing on standardized
tests of educational “attainment,” after controls have been made for 
educational aptitude at the time of entry into college. Very often 
specific suggestions are made for use of one of the national college
admissions tests (the Scholastic Aptitude Test of the College En-
trance Examination Board or the tests of the American College
Testing Program) as the input measure and scores on one of the Area
Tests of the Graduate Record Examinations (GRE) as the output
measure. At first blush, such an approach seems quite sensible.
The problem, however, is that the correlation between college means
on these measures is so high (often in the .90s) that there is generally
very little variance left that the colleges can influence. Obviously, the
overlap between the input and output measure varies somewhat de-
pending on the specific measures chosen for the study, but any two
measures of academic aptitude or achievement (and the distinction
between the two is often very fuzzy indeed!) will correlate quite
highly. This is generally referred to as the g factor by psychologists,
reflecting the general nature of cognitive skills required on such tests.
While there is some variance remaining (that is, some test perform-
ance that cannot be attributed to this general factor), this portion of
the variance can usually be best explained by differences in SES. Only
a tiny portion of differences in cognitive test scores remains that 
cannot be explained by one of these two factors. Assuming that the 
balance is all caused by differences in educational experiences (an 
unlikely assumption), the point is that there is precious little oppor- 
tunity for educational influences to be regarded as very important in 
explaining differences in student performance on such measures. This 
is not meant to suggest that formal education has no influence on its 
students. Notice that the comparison is always between institutions
and seldom (if ever) based on a college versus no-college dichotomy.
Colleges may have some influence, but the degree of influence of one
college is almost indistinguishable from that of another. This seems
to be true not only in the area of cognitive traits, but for various
noncognitive traits (for example, attitudes and values) as well.
Researchers have been interested in the question of college impacts
on students’ attitudes and values for years, and have usually come to
the conclusion that, while students definitely change during the
college years, it is extremely difficult to associate those changes with
colleges possessing certain characteristics. In the most comprehensive
summary of college-impact research ever published, Feldman and
Newcomb point out that “the degree and nature of different colleges’
impacts vary with their student inputs,” and later, “In the absence of
more complete data, we offer it only as a likely hypothesis that those
characteristics in which freshman-to-senior change is distinctive for
a given college will also have been distinctive for its entering
freshmen. . . . [their italics]”
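
The force of a correlation “in the .90s” between input and output means
can be seen with a line of arithmetic. Under a simple linear prediction
of college output means from college input means, the share of
between-college variance not predicted by the input measure is 1 - r².
The function name below is introduced here for illustration only.

```python
def unexplained_share(r: float) -> float:
    """Share of variance in output means not linearly predicted from input means."""
    return 1.0 - r * r


# Correlations of the size mentioned in the text.
for r in (0.90, 0.95):
    share = unexplained_share(r)
    print(f"r = {r:.2f}: {share:.1%} of the variance is left over")
```

With r = .90 the leftover share is 19 percent, and that remainder has to
accommodate SES differences and errors of measurement as well as any
genuine college effect.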

Part of the difficulty in discovering differential cognitive impact of 
educational institutions may be attributable to a lock-step method- 
ology that is clouding real impact differences. Given the nature of 
most tests of cognitive attributes used in such research, it probably 
shouldn’t be too surprising that they do not turn up large educational 
differences. These tests are almost always constructed so as to be 
widely appropriate and sufficiently general in nature to ensure their 
appropriateness for many educational experiences. Yet herein lies 
part of the evaluative problem. Criterion measures designed to be
broadly applicable may well be too general in nature to measure the
specific outcomes of educational experiences at a local level. Educa- 
tional evaluators may have to turn, instead, to achievement examina- 
tions geared especially to syllabi used in specific college courses if 
they are to turn up indexes of college effects. Such a procedure makes 
it difficult, however, to conduct interinstitutional comparisons, often 
felt to be the central and most important feature of educational 
accounting systems. Thus, there is a return to the problems suggested 
earlier: measures of a general nature yield little or no interinstitu- 
tional variation, while measures geared to the program of a specific 
department or institution do not allow for multicollege comparisons. 
Yet, the interinstitutional comparisons are useless if they fail to 
reveal meaningful differences, and so the specifically designed cri- 
terion measures may be the only reasonable solution. 

Reliance on a far greater variety of criterion measures (outcomes 
measures) would also seem to be desirable. This is particularly true 
during a period of what seems to border on universal higher educa- 
tion. With students of varying backgrounds, skills, interests, and 
objectives attending institutions of higher education, it seems im- 
perative that the institution begin to examine criteria other than 
some form of “intellectuality,” which, like it or not, can no longer be
regarded as the primary purpose of most higher educational institu- 
tions. 

As with other aspects of the educational evaluation paradigm, how- 
ever, it is easy to talk about the need for a variety of criterion meas- 
ures and much harder to come up with them. Social conscience, 
heightened awareness, various kinds of “appreciation,” attitudes and 
values, citizenship, moral sensitivity: all these and more have been
mentioned as projected outcomes of certain colleges. Measuring
these variables will surely not be a simple task, but there is some rea-
son for optimism. As long as it is remembered that such measures
would serve as indicators (and not definers) of desired educational
constructs, the development of the inventories and materials would 
be a difficult, time-consuming, expensive, but definitely possible and 
worthwhile task. 

The Problem of Inferring Effects in Naturalistic Settings 

In order to use output measures of student performance to compare 
the effectiveness of educational programs, adjustments must be made 
for preexisting differences among the groups. These adjustments are 
the crux of the “value-added” concept discussed earlier. Unfortu-
nately, there is no guarantee that any of the frequently used means
of making adjustments such as matching, using difference scores,
analysis of covariance, or other regression techniques will result in an
appropriate adjustment. As stated by Lord, “. . . there simply is no
logical or statistical procedure that can be counted on to make proper
allowances for uncontrolled preexisting differences between groups.”

There are two major aspects to the problem of making adjustments: 
(1) the identification of all the relevant variables for which adjust-
ments are needed, and (2) the estimation of the magnitude of the
adjustment that should be made for the variables once they are 
identified. It seems clear that allowances should be made for differ- 
ences in student aptitudes at time of entrance into the program. 
Certain background characteristics such as SES are also natural 
candidates. However, there are many other potentially important 
differences among entering students that are typically ignored or not 
thought of (for instance, motivation, sex, age). Adjustments also are 
needed for institutional characteristics that cannot be controlled by 
the institution. 

Given a set of variables for which adjustments are desired, there 
remain several sources of error that can result in biased adjustments. 
Specification errors and errors of measurement can both bias the 
comparisons of preexisting groups. The failure to include a variable 
in the model that is related either to the output or other control 
variables and on which there are preexisting differences among 
groups would be a specification error that would result in bias. 
Similarly, unreliability in the control variables will result in biased 
adjustments when the groups differ on these variables initially. As 
Astin points out, the most likely result of these shortcomings is to
misleadingly indicate college effects when, in fact, there may be none.
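
The effect of unreliable control variables can be illustrated with a
small simulation. It is a sketch under stated assumptions, not a
reconstruction of any analysis cited here: two hypothetical colleges, no
true college effect, students at the second college entering with
higher true aptitude, and an entrance test that measures aptitude with
random error. The adjustment is an analysis-of-covariance style
correction using the pooled within-college slope. All names and
parameter values are invented for illustration.

```python
import random


def mean(values):
    return sum(values) / len(values)


def pooled_within_slope(groups):
    """Pooled within-group slope of outcome on the observed input measure."""
    sxy = sxx = 0.0
    for xs, ys in groups:
        mx, my = mean(xs), mean(ys)
        sxy += sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sxx += sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def adjusted_difference(group_a, group_b):
    """Outcome gap (B minus A) after covariance adjustment for the input gap."""
    slope = pooled_within_slope([group_a, group_b])
    (xa, ya), (xb, yb) = group_a, group_b
    return (mean(yb) - mean(ya)) - slope * (mean(xb) - mean(xa))


def simulate(error_sd, n=20000, seed=0):
    """Two colleges with different intakes and NO true college effect."""
    rng = random.Random(seed)
    groups = []
    for true_mean in (0.0, 1.0):  # hypothetical college A, college B
        aptitude = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        # Observed entrance score = true aptitude + measurement error.
        observed = [t + rng.gauss(0.0, error_sd) for t in aptitude]
        # Outcome depends on true aptitude only, identically at both colleges.
        outcome = [t + rng.gauss(0.0, 0.5) for t in aptitude]
        groups.append((observed, outcome))
    return adjusted_difference(groups[0], groups[1])


perfect = simulate(error_sd=0.0)  # entrance test measured without error
noisy = simulate(error_sd=1.0)    # half the within-college variance is error

print(f"Adjusted difference, error-free input: {perfect:.3f}")
print(f"Adjusted difference, unreliable input: {noisy:.3f}")
```

When the entrance measure is error-free, the adjusted difference is
close to zero. When half of its within-college variance is error, the
pooled slope is attenuated and roughly half of the preexisting gap
survives the adjustment, so the college with the abler entrants appears
to have an effect although none was built in.
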
Conclusions

These problems suggest that evaluating differential college impact 
may not be possible at all or, at best, that it will be some time before 
it can be done very well. The real difficulty is not so much in develop-
ing new, reliable, relevant criterion measures. That will be difficult,
of course, but certainly no insurmountable task. The problem will be 
in demonstrating differential college effects on these various criteria. 
Obviously, criteria that do not yield meaningful between-college 
differences in institutional effects will not be useful for evaluating
the effectiveness of those institutions. 

For this reason, it might make sense to begin at the beginning and 
help institutions do better in the area of institutional efficiency. Im- 
mediate attention to the development of management information 
systems that would permit college administrators to base everyday 
administrative decisions on continually updated facts about the 
institution would be a welcome service, and one that could be done 
rather soon. Forecasting detailed space requirements, calculating the 
number of faculty members needed for different enrollments, show- 
ing how operating costs would increase or decrease with a change in 
certain class scheduling techniques, considering alternative staffing 
policies on such matters as teaching loads, tenure, and the like — all 
these very important aspects of institutional functioning could be 
based on facts routinely gathered and summarized, if only more in- 
stitutions knew how to do it. MIS specialists could do higher education 
a great service in this area of educational efficiency. 

While that is being done, other specialists could continue to grapple 
with the problems of assessing the outcomes of higher education. It 
would indeed be unfortunate to turn all our attention to the area of 
educational efficiency, and ignore the question of college impact, thus 
taking part in what Selznick calls the “cult of efficiency,” which over-
stresses means and totally neglects ends. But the question is
whether, given the limitations outlined earlier, it makes sense to hold 
institutions “accountable” for their effectiveness just yet, and
whether the efficiency of operations couldn’t be vastly improved while 
the effectiveness question is being considered. 

In any event, whether dealing with operational efficiency or educa-
tional effectiveness, it would be well to remember that education is a
social process and will inevitably resist simplistic evaluations of its
results. As Henry Dyer has said: 

“The term educational accountability, as used most recently by 
certain economists, systems analysts, and the like, has frequently 
been based on a conceptualization that tends, by analogy, to equate 
the educational process with the type of engineering process that 
applies to industrial production. ... It must be constantly kept in 
mind that the educational process is not on all fours with an indus- 
trial process; it is a social process in which human beings are con- 
tinually interacting with other human beings in ways that are im- 
perfectly measurable or predictable. Education does not deal with 
inert raw materials, but with living minds that are instinctively con- 
cerned first with preserving their own integrity and second with 
reaching a meaningful accommodation with the world around them.
The output of the educational process is never a ‘finished product’
whose characteristics can be rigorously specified in advance; it is an 
individual who is sufficiently aware of his own incompleteness to 
make him want to keep on growing and learning and trying to solve 
the riddle of his own existence in a world that neither he nor anyone 
else can fully understand or predict.”

Perhaps more than all the limitations discussed earlier in this 
paper, Dyer's analysis serves to emphasize that the problems in- 
volved in assessing institutional effectiveness and developing ob- 
jective criteria for accountability will continue to be hard problems. 
They are precisely the problems, however, that must be tackled with 
the best people and the best methods available if higher education 
is going to serve us well. 